[ML] Correct query times for model plot and forecast #327

tveasey · 2018-12-04T13:18:20Z

We were querying for the model bounds and forecast points at the beginning of each bucket. Instead we should match the time offset we apply to bucket samples when we update the model.

The upshot was that model bounds and forecasts were (typically) offset in time with respect to the data values. The problem is particularly noticeable for long bucket lengths. For example, the figures below show the model bounds for 1 day buckets before and after the fix.

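Before: [figure: model bounds for 1 day buckets before the fix]

After: [figure: model bounds for 1 day buckets after the fix]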
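To give a feel for the size of the discrepancy, here is a minimal sketch. It assumes, purely for illustration, that a bucket's sample time is its midpoint; the real offset is whatever model_t::sampleTime returns for the feature in question. The names `sampleTimeSketch`, `queryTimeBefore` and `queryTimeAfter` are hypothetical and do not exist in the code base.

```cpp
#include <cassert>
#include <cstdint>

// Seconds since the epoch; a stand-in for core_t::TTime.
using TTime = std::int64_t;

// ASSUMPTION: samples are attributed to the bucket midpoint. The real
// offset is feature specific and comes from model_t::sampleTime.
TTime sampleTimeSketch(TTime bucketStart, TTime bucketLength) {
    assert(bucketLength > 0);
    return bucketStart + bucketLength / 2;
}

// Before the fix: bounds were queried at the beginning of the bucket.
TTime queryTimeBefore(TTime bucketStart, TTime /*bucketLength*/) {
    return bucketStart;
}

// After the fix: bounds are queried at the same time offset that is
// applied to the bucket samples when the model is updated.
TTime queryTimeAfter(TTime bucketStart, TTime bucketLength) {
    return sampleTimeSketch(bucketStart, bucketLength);
}
```

Under that midpoint assumption, the two query times differ by half a bucket, which is 12 hours for 1 day buckets and only 150 seconds for 5 minute buckets. This is consistent with the problem being most visible for long bucket lengths.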

hendrikmuhs · 2018-12-04T14:49:38Z

lib/api/CForecastRunner.cc

+                    core_t::TTime bucketLength{model.s_ForecastModel->params().bucketLength()};
+                    core_t::TTime startTime{model_t::sampleTime(
+                        feature, forecastJob.s_StartTime, bucketLength)};
+                    core_t::TTime endTime{model_t::sampleTime(


Would it be possible to fix it in CAnomalyJob::doForecast instead? CForecastRunner is just a dumb worker and should not contain any important logic. CAnomalyJob::doForecast calls into the runner and sets startTime to m_LastResultsTime, so it seems to me that adjusting it there does the same thing but is a bit cleaner. endTime is just relative to startTime anyway.

Maybe the same can be done for model plots.

The problem is this is feature specific. So it is tricky to push it higher up if the forecast is being run over a job with multiple detectors with different features.

I could create a wrapper which implements the logic in the model library. I can't directly push the feature into the forecast function (because it is in the maths library which can't depend on EFeature). I could supply a call back to compute the offset start and end times and have this use the wrapper from the model library.

Alternatively, how about I add a function to actually run the forecast to model_t which wraps up this detail. Given we only have the maths::CTimeSeriesModel here (for good reason) this seems like it might be the cleanest option.
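The callback option mentioned above could look roughly like the following. This is only a sketch under stated assumptions: `forecastQueryTimes`, `EFeatureSketch` and `makeOffset` are hypothetical names, and the midpoint offset is an illustrative stand-in for model_t::sampleTime. The point is that the maths-side function never sees the feature, only a function mapping a bucket start time to a sample time.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

using TTime = std::int64_t; // stand-in for core_t::TTime
using TTimeOffsetFunc = std::function<TTime(TTime)>;

// "maths" side (hypothetical): has no knowledge of features. It only
// applies the supplied callback to each bucket start time.
std::vector<TTime> forecastQueryTimes(TTime startTime,
                                      TTime endTime,
                                      TTime bucketLength,
                                      const TTimeOffsetFunc& toSampleTime) {
    assert(bucketLength > 0);
    std::vector<TTime> result;
    for (TTime time = startTime; time < endTime; time += bucketLength) {
        result.push_back(toSampleTime(time));
    }
    return result;
}

// "model" side (hypothetical): the only place that knows about features.
enum class EFeatureSketch { E_BucketStart, E_BucketMidpoint };

// ASSUMPTION: the two offsets here are illustrative; the real mapping is
// whatever model_t::sampleTime does for each feature.
TTimeOffsetFunc makeOffset(EFeatureSketch feature, TTime bucketLength) {
    return [feature, bucketLength](TTime bucketStart) {
        return feature == EFeatureSketch::E_BucketMidpoint
                   ? bucketStart + bucketLength / 2
                   : bucketStart;
    };
}
```

The dependency then only points from the model library to the maths library, at the cost of an extra parameter on the forecast call.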

> The problem is this is feature specific. So it is tricky to push it higher up if the forecast is being run over a job with multiple detectors with different features.

OK, I see, and I agree that's too complicated.

What about inside of model.s_ForecastModel->forecast(...)?

That hits the library dependency issue mentioned above. However, what if I add a CForecastDataSink::SForecastModelWrapper::forecast function which takes the forecast job? This could wrap all the functionality currently in this loop.

Sounds good. I am also OK with keeping the current version if the alternatives are too complicated.

I think I like the idea of wrapping this in SForecastModelWrapper. It seems more natural to me than doing it in this loop, which is really just about scheduling. I'll implement it and see how it looks.

f980f26. Note that none of the members of SForecastModelWrapper are needed outside of the new forecast function, so I converted it to a class.
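For readers without the diff to hand, the shape of that refactoring can be sketched as below. This is not the code from f980f26: `CForecastModelWrapperSketch`, `SForecastJobSketch` and `TForecastFunc` are hypothetical stand-ins, and the fixed sample offset passed to the constructor is an assumption that replaces the feature-specific model_t::sampleTime call.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

using TTime = std::int64_t; // stand-in for core_t::TTime
using TTimeVec = std::vector<TTime>;

// Hypothetical stand-in for the forecast job handed over by the scheduler.
struct SForecastJobSketch {
    TTime s_StartTime;
    TTime s_Duration;
};

// Hypothetical stand-in for the underlying model's forecast call: it is
// given the adjusted [start, end) interval and the step between points.
using TForecastFunc = std::function<TTimeVec(TTime, TTime, TTime)>;

class CForecastModelWrapperSketch {
public:
    // ASSUMPTION: the offset is supplied directly here; in the real code
    // it would be derived from the feature.
    CForecastModelWrapperSketch(TTime bucketLength, TTime sampleOffset, TForecastFunc model)
        : m_BucketLength{bucketLength}, m_SampleOffset{sampleOffset},
          m_Model{std::move(model)} {
        assert(bucketLength > 0);
    }

    // The scheduling loop only passes the job; the time adjustment is an
    // implementation detail of the wrapper.
    TTimeVec forecast(const SForecastJobSketch& job) const {
        TTime startTime{job.s_StartTime + m_SampleOffset};
        TTime endTime{startTime + job.s_Duration};
        return m_Model(startTime, endTime, m_BucketLength);
    }

private:
    // Nothing outside forecast needs these, so they can all be private.
    TTime m_BucketLength;
    TTime m_SampleOffset;
    TForecastFunc m_Model;
};
```

With this shape the runner's loop reduces to calling `forecast(job)` on each wrapper, so it stays concerned with scheduling only.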

hendrikmuhs

LGTM

droberts195 · 2018-12-06T12:23:19Z

I removed the v6.5.3 label from this PR as this was backed out of 6.5 in #330. We need to put more thought into the impact of changing results document timestamps.

tveasey · 2018-12-07T14:35:18Z

We discussed this some more. There were some misunderstandings about the nature of the change, but it also changed the default offsets within time buckets at which forecast points were requested. In #332 I reverted to the old style of defining the forecast points at "bucket time", i.e. offset zero. We will target this and #332 together at 6.5.4.
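One way to picture the combined effect of this change and #332 is to separate the time at which the model is evaluated from the timestamp written to the result. The sketch below is an illustration only: `SForecastPointSketch` and `forecastPoint` are hypothetical names, and the offset is passed in rather than computed from the feature.

```cpp
#include <cassert>
#include <cstdint>

using TTime = std::int64_t; // stand-in for core_t::TTime

struct SForecastPointSketch {
    TTime s_QueryTime;  // time at which the model is evaluated
    TTime s_ResultTime; // timestamp written to the results document
};

// ASSUMPTION: sampleOffset stands in for the feature-specific offset.
// The prediction is made at the offset time but reported at "bucket
// time", i.e. offset zero, so result timestamps do not change.
SForecastPointSketch forecastPoint(TTime bucketStart, TTime sampleOffset) {
    assert(sampleOffset >= 0);
    return {bucketStart + sampleOffset, bucketStart};
}
```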

Fix query times for model plot and forecast

1f2312a

tveasey added >bug :ml v6.5.3 labels Dec 4, 2018

tveasey requested a review from edsavage December 4, 2018 13:18

droberts195 added v7.0.0 v6.6.0 labels Dec 4, 2018

hendrikmuhs reviewed Dec 4, 2018

tveasey added 2 commits December 5, 2018 10:18

Review comment

f980f26

Docs

9e959c8

hendrikmuhs approved these changes Dec 5, 2018

tveasey merged commit f54a2c3 into elastic:master Dec 5, 2018

tveasey added a commit to tveasey/ml-cpp-1 that referenced this pull request Dec 5, 2018

[ML] Correct query times for model plot and forecast (elastic#327)

2b68d39

tveasey mentioned this pull request Dec 5, 2018

[6.6][ML] Correct query times for model plot and forecast #328

Merged

tveasey added a commit to tveasey/ml-cpp-1 that referenced this pull request Dec 5, 2018

[ML] Correct query times for model plot and forecast (elastic#327)

c9b320b

droberts195 mentioned this pull request Dec 5, 2018

[CI] ForecastIT#testSingleSeries fails reproducibly on master elastic/elasticsearch#36258

Closed

tveasey added a commit that referenced this pull request Dec 5, 2018

[ML] Correct query times for model plot and forecast (#328)

bcd02cc

Backport #327.

tveasey mentioned this pull request Dec 5, 2018

[6.5.3][ML] Correct query times for model plot and forecast #329

Merged

tveasey added a commit that referenced this pull request Dec 5, 2018

[ML] Correct query times for model plot and forecast (#329)

c53a6f0

Backport #327.

droberts195 mentioned this pull request Dec 6, 2018

[ML][CI] fix integration test after change in ml-cpp#327 elastic/elasticsearch#36296

Closed

droberts195 removed the v6.5.3 label Dec 6, 2018

tveasey mentioned this pull request Dec 7, 2018

[ML] Write out forecasts predictions at "bucket time" rather than the actual times they are made in the time buckets #332

Merged

tveasey added the v6.5.4 label Dec 7, 2018

tveasey added a commit to tveasey/ml-cpp-1 that referenced this pull request Dec 12, 2018

[ML] Correct query times for model plot and forecast (elastic#328)

ce6378f

Backport elastic#327.

tveasey mentioned this pull request Dec 12, 2018

[6.5][ML] Correct query times for model plot and forecast #339

Merged

tveasey added a commit that referenced this pull request Dec 13, 2018

[6.5][ML] Correct query times for model plot and forecast (#339)

8b723aa

Backport #327.

lcawl mentioned this pull request Dec 17, 2018

[DOCS] Edits 6.5.4 release notes entries #346

Merged

tveasey deleted the bug/model-plot-forecast-time-offset branch May 1, 2019 14:59
